ABSTRACT

We describe a simple fully analytic model of the excursion set approach asso- 
ciated with two Gaussian random walks: the first walk represents the initial 
overdensity around a protohalo, and the second is a crude way of allowing for 
other factors which might influence halo formation. This model is richer than 
that based on a single walk, because it yields a distribution of heights at first 
crossing. We provide explicit expressions for the unconditional first crossing dis- 
tribution which is usually used to model the halo mass function, the progenitor 
distributions from which merger rates are usually estimated, and the conditional 
distributions from which correlations with environment are usually estimated. 
These latter exhibit perhaps the simplest form of what is often called nonlocal 
bias, and which we prefer to call stochastic bias, since the new bias effects arise 
from 'hidden-variables' other than density, but these may still be defined locally.
We provide explicit expressions for these new bias factors. We also provide for- 
mulae for the distribution of heights at first crossing in the unconditional and 
conditional cases. In contrast to the first crossing distribution, these are exact, 
even for moving barriers, and for walks with correlated steps. The conditional 
distributions yield predictions for the distribution of halo concentrations at fixed 
mass and formation redshift. They also exhibit assembly bias like effects, even 
when the steps in the walks themselves are uncorrelated. Our formulae show 
that without prior knowledge of the physical origin of the second walk, the naive 
estimate of the critical density required for halo formation which is based on 
the statistics of the first crossing distribution will be larger than that based on 
the statistical distribution of walk heights at first crossing; both will be biased 
low compared to the value associated with the physics. Finally, we show how 
the predictions are modified if we add the requirement that halos form around 
peaks: these depend on whether the peaks constraint is applied to a combination 
of the overdensity and the other variable, or to the overdensity alone. Our results 
demonstrate the power of requiring models to reproduce not just halo counts but 
the distribution of overdensities at fixed protohalo mass as well. 
1 INTRODUCTION



The excursion set approach, pioneered by Epstein (1983) and
developed substantially by Bond et al. (1991), Lacey & Cole (1993),
Mo & White (1996) and Sheth (1998), yields important insight into
various features of hierarchical clustering. Although recent work
has highlighted the limitations of this approach (Paranjape & Sheth
2012), the limitations are primarily of a quantitative rather than
qualitative nature.

The approach combines the statistics of the initial density
fluctuation field with the physics of spherical or triaxial collapse,
to make predictions for the abundance of virialized objects as a
function of time. This means that it provides information about
merger rates, the high-redshift progenitors of objects of fixed mass
at a later time, the tendency for the mass function in dense regions
to be top-heavy, and hence how the spatial clustering of these
objects depends on their mass.

In the spherical collapse model, the evolution of an 
object is determined by its own overdensity. This enters 
in the excursion set approach as follows. One associates 
a one-dimensional random walk with each position in 
space; this walk shows how the initial overdensity de- 
pends on the smoothing scale over which the density is 
averaged. The largest scale on which this walk exceeds the 
critical density required for spherical collapse contains a 
mass; this is the excursion set estimate of the mass of the 
object in which this particular position in space will end up.
Therefore, in this approach, the technical problem to
be solved is that of the first crossing distribution of a 
barrier whose height may depend on the number of steps 
taken by the one-dimensional random walk. The statistics 
of the initial fluctuation field determines the ensemble of 
walks over which to average. 

In triaxial collapse models, the evolution of an object is determined
by more than its initial overdensity (Bond & Myers 1996; Sheth et al.
2001). In the context of such models, it is natural to ask how these
extra parameters enter the excursion set approach. It should come as
no surprise that each additional variable simply adds an extra walk
(Sheth et al. 2001; Chiueh & Lee 2001; Sheth & Tormen 2002), but
there is no guarantee that these variables are Gaussian distributed.
As a result, the technical problem becomes one of first crossing a
multi-dimensional barrier by multi-dimensional walks. However, it has
recently been realized that this has nontrivial, qualitatively
different, consequences for halo bias: in effect, the correlations
between these other parameters and the large scale density field
introduce what are known as nonlocal bias effects (Sheth et al. 2012).
In this respect, the multi-dimensional excursion set approach is
considerably richer than the one-dimensional one.

The main goal of this paper is to illustrate a num- 
ber of these qualitatively new features of the multi- 
dimensional excursion set approach. Our goal here is not 
so much to develop a model which reproduces effects seen 
in simulations, as to develop insight: therefore, the em- 
phasis is on developing a fully analytic model in which 
it is easy to see the origin of these new effects. It turns
out that this model may not be that unrealistic - this is
explored further in Achitouv et al. (2013).

Section 2 describes our model and provides expressions for the
usual excursion set approach quantities, as
well as for the qualitatively new ones. Section 3 describes
a number of extensions, including an explicit calcula- 
tion of how all the predictions are modified if protohalos 
are identified with peaks in the initial field. We use this 
to demonstrate how requiring models to reproduce both 
halo counts as well as overdensities at fixed halo mass 
provides sharp constraints. A final section summarizes. 



2 TWO INDEPENDENT GAUSSIAN WALKS WITH UNCORRELATED STEPS

Let $\delta$ and $g$ both denote zero-mean Gaussian variables, with
variance $\langle\delta^2\rangle = s$ and
$\langle g^2\rangle = \beta^2 s$ respectively. When plotted as a
function of $s$, these represent walks associated with the
overdensity and the second variable which matters for collapse. We
will assume that $\delta$ and $g$ are independent:
$\langle\delta g\rangle = 0$.

We will use $f(s)$ to denote the distribution of $s$ when

$\delta \ge \delta_c(s) + g$   (1)

for the first time. We will also be interested in
$p(\delta_{1\times}|s)$, the distribution of walk heights at first
crossing. The excursion set ansatz assumes that the quantity $f(s)$
is related to the mass fraction in halos having mass $m(s)$ by

$f(s)\,ds = \frac{m}{\bar\rho}\,\frac{dn}{dm}\,dm$   (2)

where $dn/dm$ is the comoving number density of halos of mass $m$,
and $\bar\rho$ is the comoving background density.

2.1 Rotation of coordinate system 

When the inequality (1) is saturated, it defines a line in the
$(\delta, g)$ plane. The clearest way to think of this problem is to
change variables to ones which run parallel and perpendicular to this
line. Therefore, define

$g_- \equiv \frac{\delta - g}{\sqrt{1+\beta^2}}$ and $g_+ \equiv \frac{\beta\delta + g/\beta}{\sqrt{1+\beta^2}}$.   (3)

Notice that $\langle g_-^2\rangle = \langle g_+^2\rangle = s$, and
that these variables are independent:

$\langle g_+ g_-\rangle = \frac{\beta\langle\delta^2\rangle - \langle g^2\rangle/\beta}{1+\beta^2} = 0$.   (4)

In these variables, $g_-$ steps towards or away from the barrier,
which has height $\delta_c(s)/\sqrt{1+\beta^2}$, and $g_+$ steps
parallel to it.

For what follows, it is useful to note that

$\delta = \frac{g_- + \beta g_+}{\sqrt{1+\beta^2}}$ and $g = \beta\,\frac{g_+ - \beta g_-}{\sqrt{1+\beta^2}}$.   (5)

2.2 Unconditional first crossing distribution 

The independence of $g_+$ and $g_-$ means that $f(s)$ depends only
on $g_-$. Since $g_-$ is just a one dimensional Gaussian walk, and it
must cross a barrier of height $\delta_c(s)/\sqrt{1+\beta^2}$, the
first crossing distribution is that for a moving barrier, for which
simple approximations are available (Sheth & Tormen 2002).

For the special case in which $\delta_c$ does not depend on $s$, the
first crossing distribution is

$s f(s) = \frac{\nu_\beta\,\exp(-\nu_\beta^2/2)}{\sqrt{2\pi}}$   (6)

where

$\nu_\beta^2 \equiv \frac{\delta_c^2(0)/s}{1+\beta^2} \equiv \frac{\nu^2}{1+\beta^2}$.   (7)

Notice that $\beta = 0$ yields the usual one-dimensional solution.

Notice also that the factor $1+\beta^2$ can be viewed in either of
two ways. Either it rescales the barrier height (which is how it
appeared in the analysis above) or it rescales the variance $s$. Now,
the first crossing distribution $f(s)\,ds$ is usually equated with
the mass fraction in halos of mass $m$ (equation 2). If $\delta_c$
itself is expected to be related to the physics of halo formation,
then the rescaling of $\delta_c$ means that one must also understand
the physics which led to $\beta \ne 0$ if one wishes to derive the
value of $\delta_c$ from halo abundances. Failure to do so will lead
to a misestimate of the value of $\delta_c$ which matters for the
physics. If we require $\delta_c \approx 1.686$, then matching halo
counts requires $(1+\beta^2)^{-1} \approx 0.7$ so
$\beta \approx 0.6$ (Sheth & Tormen 1999).
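As a rough numerical illustration, the sketch below evaluates the constant-barrier first crossing distribution in the form $s f(s) = \nu_\beta \exp(-\nu_\beta^2/2)/\sqrt{2\pi}$ with $\nu_\beta^2 = (\delta_c^2/s)/(1+\beta^2)$, and compares its integral up to some $s_{\rm max}$ with the standard closed form ${\rm erfc}(\nu_\beta/\sqrt{2})$ for a constant barrier. The helper names and the integration settings are assumptions made for this example only.

```python
import math

def nu_beta(delta_c, s, beta):
    """Rescaled peak height for a constant barrier."""
    return delta_c / math.sqrt(s * (1.0 + beta**2))

def s_f_of_s(delta_c, s, beta):
    """s f(s) for a constant barrier."""
    nu = nu_beta(delta_c, s, beta)
    return nu * math.exp(-0.5 * nu**2) / math.sqrt(2.0 * math.pi)

def crossed_fraction(delta_c, s_max, beta):
    """Closed-form fraction of walks that first cross at s < s_max."""
    return math.erfc(nu_beta(delta_c, s_max, beta) / math.sqrt(2.0))

def integrate_f(delta_c, s_max, beta, n=20000, s_min=1e-4):
    """Midpoint rule for the integral of f(s) ds, done in ln(s)."""
    lo, hi = math.log(s_min), math.log(s_max)
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        s = math.exp(lo + (i + 0.5) * h)
        total += s_f_of_s(delta_c, s, beta)
    return total * h

def beta_from_rescaling(factor):
    """Solve (1 + beta^2)^(-1) = factor for beta."""
    return math.sqrt(1.0 / factor - 1.0)
```

With a rescaling factor of 0.7, `beta_from_rescaling` returns a value close to 0.65, consistent with the rounded $\beta \approx 0.6$ quoted in the text.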



2.3 Distribution of height at first crossing 



Define $\delta_{1\times}$ to be the value of $\delta$ when
$g_- = \delta_c(s)/\sqrt{1+\beta^2}$. Then

$\delta_{1\times} = \frac{\delta_c(s)/\sqrt{1+\beta^2} + \beta g_+}{\sqrt{1+\beta^2}} = \frac{\delta_c(s)}{1+\beta^2} + \frac{\beta}{\sqrt{1+\beta^2}}\,g_+$.   (8)

Since $g_+$ is just a Gaussian with zero mean and variance $s$
(recall it is independent of $g_-$), the expression above shows that

$p(\delta_{1\times}|s) = \frac{\exp[-(\delta_{1\times} - \mu_{1\times})^2/2\Sigma_{1\times}^2]}{\sqrt{2\pi\Sigma_{1\times}^2}}$   (9)

where

$\mu_{1\times} \equiv \frac{\delta_c(s)}{1+\beta^2}$ and $\Sigma_{1\times}^2 \equiv \frac{s\,\beta^2}{1+\beta^2}$.   (10)

The limit $\beta = 0$ yields a delta-function centered on
$\delta_c(s)$ as it should.

If we set $\nu_{1\times} = \delta_{1\times}/\sigma$ where
$\sigma^2 = s$, and recall from equation (7) that
$\nu_\beta = (\delta_c(0)/\sigma)/\sqrt{1+\beta^2}$, then it is
useful to think of the distribution above as
$p(\nu_{1\times}|\nu_\beta)$, the conditional distribution of
$\nu_{1\times}$ given $\nu_\beta$: in this case, the expression above
is the standard expression for the conditional Gaussian distribution
with correlation parameter $(1+\beta^2)^{-1/2}$.

Note that equation (8), and hence equation (9), are exact even when
$\delta_c$ depends on $s$. In this respect, the distribution of
$\delta_{1\times}$ at first crossing is much simpler than is the
first crossing distribution itself - it always has a Gaussian shape,
with the barrier only affecting the mean value of this Gaussian.

It is also worth noting that
$\langle\delta_{1\times}|s\rangle = \mu_{1\times}$ is guaranteed to
be less than $\delta_c$. Thus, without prior knowledge of the value
of $\beta$, the statistical distribution of $\delta_{1\times}$ will
lead to a misestimate of the value of $\delta_c$ which is associated
with the physics. In this context, it is useful to think in terms of
the distribution of differences from $\delta_c$. If we define
$\Delta_{1\times-c} \equiv \delta_{1\times} - \delta_c(s)$, then it
is Gaussian distributed with mean $-\delta_c(s)\beta^2/(1+\beta^2)$
and variance $s\beta^2/(1+\beta^2)$. I.e., the magnitude of the mean
is $\delta_c(s)$ times the same factor by which $s$ is rescaled. This
provides a simple operational way of determining the value of $\beta$
from a measurement of $p(\Delta_{1\times-c}|s)$.
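The Gaussian form of the height distribution can be tested with a small Monte Carlo experiment. The sketch below is illustrative only: it assumes a constant barrier, uncorrelated steps of width `ds`, and compares the measured heights with a mean of $\delta_c/(1+\beta^2)$ and a variance of $s\beta^2/(1+\beta^2)$. The step size, number of walks and seed are arbitrary choices, and the finite step makes discrete walks overshoot the barrier slightly, so agreement is only expected to within a few per cent.

```python
import math
import random

def first_crossings(beta, delta_c, ds=0.01, s_max=4.0, n_walks=4000, seed=1):
    """Return (s, delta) at the first step where delta >= delta_c + g."""
    rng = random.Random(seed)
    sig_delta = math.sqrt(ds)      # step rms of the delta walk
    sig_g = beta * sig_delta       # step rms of the g walk
    n_steps = int(round(s_max / ds))
    crossings = []
    for _ in range(n_walks):
        delta = 0.0
        g = 0.0
        for k in range(1, n_steps + 1):
            delta += rng.gauss(0.0, sig_delta)
            g += rng.gauss(0.0, sig_g)
            if delta - g >= delta_c:
                crossings.append((k * ds, delta))
                break              # keep only the first crossing
    return crossings

def summarize(crossings, beta, delta_c):
    """Mean height, and mean of (delta - mu)^2 / s, over all crossings."""
    mu = delta_c / (1.0 + beta**2)
    n = len(crossings)
    mean_height = sum(d for _, d in crossings) / n
    var_over_s = sum((d - mu)**2 / s for s, d in crossings) / n
    return mean_height, var_over_s
```

Because the walk along the barrier is independent of the walk towards it, walks which have not crossed by `s_max` can be discarded without biasing the mean height; the second returned number should approach $\beta^2/(1+\beta^2)$.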

2.4 Distribution of the barrier at first crossing 

Similarly, define $g_{1\times}$ to be the value of $g$ at first
crossing. Then, because $g_{1\times} = \delta_{1\times} - \delta_c$,
it has the same distribution as $\delta_{1\times}$, but with a
shifted mean. Specifically, $p(g_{1\times})$ will be Gaussian with
mean $-\delta_c(s)\beta^2/(1+\beta^2)$ and variance
$s\beta^2/(1+\beta^2)$.



2.5 The two-barrier problem and progenitor 
distributions 

Symmetry means that the distribution of $s_1$ at which

$\delta \ge \delta_{c1} + g$   (11)

for the first time, given that inequality (1) was first satisfied on
scale $s_0 < s_1$, is given by equation (6) but with $\nu_\beta^2$
replaced by

$\nu_{10}^2 \equiv \frac{(\delta_{c1} - \delta_{c0})^2}{(s_1 - s_0)(1+\beta^2)}$.   (12)

The limit $\beta = 0$ yields the usual expression for progenitor
distributions associated with one-dimensional walks (e.g. Lacey &
Cole 1993).

Halo formation is often identified with the time when at least half
the total mass has been assembled in pieces that are each more than
$\mu$ times the final mass. For $\mu \ge 1/2$, there can be only one
such piece so the formation time distribution is given by

$p(\delta_{cf} > \delta_{c1}|M, \delta_{c0}) = \int_{\mu M}^{M} dm\,\frac{M}{m}\,f(m, \delta_{c1}|M, \delta_{c0})$   (13)

(Lacey & Cole 1993). For white noise initial conditions
($s \propto m^{-1}$) and $\mu = 1/2$ this becomes

$p(\omega_f) = 2\omega_f\,{\rm erfc}(\omega_f/\sqrt{2})$   (14)

where $\omega_f = \nu_{f0}$ with $\nu_{f0}$ given by equation (12).
Because $\omega_f$ includes a factor of $1+\beta^2$, the mean
formation redshift will be scaled to higher values than when
$\beta = 0$. This sort of rescaling yields better agreement with
measurements in simulations (Giocoli et al. 2007; Moreno, Giocoli &
Sheth 2008). See Sheth (2011) for the case $\mu < 1/2$.
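The effect of $\beta$ on formation times can be made concrete with a short calculation. The sketch below assumes the white-noise formation distribution $p(\omega) = 2\omega\,{\rm erfc}(\omega/\sqrt{2})$ and the scaled variable $\omega^2 = (\delta_{cf} - \delta_{c0})^2/[(s_f - s_0)(1+\beta^2)]$; the function names and the simple midpoint integrator are our own illustrative choices.

```python
import math

def p_omega(omega):
    """Formation-time distribution for white noise and mu = 1/2."""
    return 2.0 * omega * math.erfc(omega / math.sqrt(2.0))

def integrate(fn, a, b, n=20000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return sum(fn(a + (i + 0.5) * h) for i in range(n)) * h

def barrier_increment(omega, s_f, s_0, beta):
    """delta_cf - delta_c0 corresponding to a given scaled omega."""
    return omega * math.sqrt((s_f - s_0) * (1.0 + beta**2))

norm = integrate(p_omega, 0.0, 10.0)
mean_omega = integrate(lambda w: w * p_omega(w), 0.0, 10.0)
```

Since the distribution of $\omega$ is universal, the typical barrier increment, and hence the typical formation redshift, is larger by a factor $\sqrt{1+\beta^2}$ than in the one-walk case.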

2.6 Conditional distributions and correlations 
with environment 

Similarly, the distribution of $s$ at which inequality (1) is first
satisfied, given that $\delta$ has height $\Delta$ on some scale
$S < s$, but $G$ (the value of $g$ on scale $S$) is unconstrained
(except by the requirement that $\Delta - G < \delta_{c0}$), is also
given by equation (6) but with $\nu_\beta^2$ replaced by

$\nu^2 \equiv \frac{(\delta_{c0} - \Delta)^2}{s(1+\beta^2) - S}$.   (15)



(The Appendix provides a short derivation.) This can be thought of
as subtracting from the variance $s(1+\beta^2)$ the piece which comes
from constraining $\delta = \Delta$ on scale $S$, which makes its
correspondence to the one-dimensional expression (the $\beta = 0$
limit of this expression) obvious.

Figure 1. First crossing distributions for our two-dimensional walks
with $\beta = 0.3$ which are conditioned to first pass through
$\Delta = 0$ on scale $S = 0.1\,\delta_{c0}^2$ (histogram); smooth
curve shows our prediction (equation 15 in equation 6). The effective
cosmology of this environment has critical density $\delta_{c0}$;
dashed curve shows the progenitor distribution with this same
effective cosmology (equation 12 in equation 6). For one-dimensional
walks, the solid curve would be the same as the dashed one.

Because equation (12) is different from (15), when expressed as a
function of $s$ rather than $\nu$, the conditional distribution is
different from the progenitor one, whereas they are the same for
one-dimensional walks. The difference between the two is largest in
the $s \to S$ limit, where the conditional distribution predicts more
objects than does the progenitor distribution. Figure 1 illustrates.
Thus, a discrepancy between the progenitor and environmental
dependences of clustering provides a simple way to see if
stochasticity has played a role in determining halo abundances.

Things are slightly more complicated if $\delta_c$ depends on $s$, of
course, but the basic fact that progenitor and conditional
distributions with $\delta_{c1} - \delta_{c0} = \delta_{c0} - \Delta$
will no longer be the same is generic.
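The difference between the two cases is easy to see by evaluating the two scaling variables side by side. The sketch below assumes the progenitor form $\nu^2 = d^2/[(s-S)(1+\beta^2)]$ and the conditional form $\nu^2 = d^2/[s(1+\beta^2)-S]$, with the same barrier offset $d$ in both; the function names are illustrative.

```python
def nu2_progenitor(d, s, S, beta):
    """Scaling variable for the two-barrier (progenitor) problem."""
    return d**2 / ((s - S) * (1.0 + beta**2))

def nu2_conditional(d, s, S, beta):
    """Scaling variable when only the overdensity is constrained on scale S."""
    return d**2 / (s * (1.0 + beta**2) - S)

def denominator_gap(s, S, beta):
    """Conditional minus progenitor denominator; equals beta^2 * S."""
    return (s * (1.0 + beta**2) - S) - (s - S) * (1.0 + beta**2)
```

The two denominators differ by $\beta^2 S$, independent of $s$. As $s \to S$ the progenitor variable diverges while the conditional one stays finite, so the conditional distribution retains objects where the progenitor distribution has essentially none.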

2.7 Stochastic (nonlocal) bias 

The distribution of $s$ at which inequality (1) is first satisfied,
given that the walk was at $(\Delta, G)$ on scale $S < s$, is also
given by equation (6) but with $\nu_\beta^2$ replaced by

$\nu^2 \equiv \frac{(\delta_{c0} - \Delta + G)^2}{(s - S)(1+\beta^2)}$.   (16)

This follows from the fact that the distance from a point
$(x_0, y_0)$ to the line $ax + by + c = 0$ is
$|ax_0 + by_0 + c|/\sqrt{a^2 + b^2}$. Alternatively, one can view
this as the same shift of origin to the $g_-$ walk that is made in
the one-dimensional case (e.g. Lacey & Cole 1993). The expression
above shows that $G$ can affect halo abundances in qualitatively the
same way that $\Delta$ can.

In more detail, the halo overdensity is defined by the ratio of the
conditional expression to the unconditional one (Mo & White 1996). In
our case, this means that

$1 + \delta_h(\nu|\Delta, G) = \frac{f(s|\Delta, G, S)}{f(s)}$.   (17)

The peak-background split bias factors are the coefficients in the
Taylor series expansion of the expression above, in the limit where
$s \gg S$. If we write these as

$\delta_h(\nu|\Delta, G) = \sum_{i,j} c_{ij}\,\Delta^i G^j$   (18)

then the dependence on $G$ gives rise to what is known as nonlocal
bias. Since $G$ may also be determined by local quantities, this is,
in general, a misnomer. Since it is really an effect which arises
from the dependence of halo counts on the 'hidden' stochastic
variable $G$, we think it is more accurate to call this 'stochastic'
bias, which may or may not be local.

Recently, Musso et al. (2012) have shown that cross-correlating the
halo overdensity field with the $n$th-order Hermite polynomial
$H_n(\Delta/\langle\Delta^2\rangle^{1/2})$ is an efficient way of
reconstructing the $b_n$ coefficients even when
$\langle\Delta^2\rangle^{1/2}$ is not small. In our case,
cross-correlating with
$H_i(\Delta/\langle\Delta^2\rangle^{1/2})\,H_j(G/\langle G^2\rangle^{1/2})$
yields



Bi 



u ..i+i- 



n+i+i(u), (19) 

where $\nu^2 = (\delta_{c0}^2/s)/(1+\beta^2)$. This reduces to the
usual expression (Mo & White 1996; Musso et al. 2012) when $j = 0$:

$\delta_c^k\,b_k = \nu^{k-1}\,H_{k+1}(\nu)$.   (20)
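For reference, the $j = 0$ coefficients can be evaluated with a few lines of code. This is a minimal sketch, assuming the relation $\delta_c^k b_k = \nu^{k-1} H_{k+1}(\nu)$ and assuming that $H_n$ denotes the probabilists' Hermite polynomials (so that $H_2(\nu) = \nu^2 - 1$, which reproduces the familiar linear bias $(\nu^2 - 1)/\delta_c$); the helper names are hypothetical.

```python
def hermite_prob(n, x):
    """Probabilists' Hermite polynomial He_n(x), by upward recurrence."""
    if n == 0:
        return 1.0
    h_prev, h = 1.0, x                 # He_0 and He_1
    for k in range(1, n):
        # He_{k+1}(x) = x He_k(x) - k He_{k-1}(x)
        h_prev, h = h, x * h - k * h_prev
    return h

def bias_coefficient(k, nu, delta_c):
    """b_k from delta_c^k b_k = nu^(k-1) He_{k+1}(nu)."""
    return nu**(k - 1) * hermite_prob(k + 1, nu) / delta_c**k
```

For example, `bias_coefficient(1, nu, delta_c)` returns $(\nu^2-1)/\delta_c$ and `bias_coefficient(2, nu, delta_c)` returns $\nu^2(\nu^2-3)/\delta_c^2$.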



Since the dependence of equation (16) on $G$ is the same as that on
$\Delta$, cross-correlating with $H_n(G/\langle G^2\rangle^{1/2})$
alone yields



Bn 



Ck 



(■ 



(21) 



In this respect, the stochastic (possibly nonlocal) bias model here
is simpler than that in Sheth et al. (2012), where the analogue of
$G$ was not Gaussian distributed (so the associated orthogonal
polynomials were more complicated).

2.8 Assembly bias 

Assembly bias is the correlation between properties of protohalos of
fixed mass and their environment, such as those first identified by
Sheth & Tormen (2004), and studied since by many others. While it is
generally believed that this effect should be absent in excursion set
models with uncorrelated steps (White 1996), we now show that our
two-dimensional model does exhibit assembly bias, even though the
steps in the walks are uncorrelated. However, we caution that we are
not claiming that this model explains assembly bias; simply that
assembly bias is part and parcel of the multi-dimensional excursion
set approach, even for walks with uncorrelated steps.

The distribution of walk heights at first crossing, given that
$\delta = \Delta$ on scale $S$, is

$p(\delta_{1\times}|s, \Delta, S) = \frac{\exp[-(\delta_{1\times} - \Delta - \mu_\Delta)^2/2\Sigma_\Delta^2]}{\sqrt{2\pi\Sigma_\Delta^2}}$   (22)

where

$\mu_\Delta \equiv \frac{\delta_c(s) - \Delta}{1+\beta^2}$ and $\Sigma_\Delta^2 \equiv \frac{(s - S)\,\beta^2}{1+\beta^2}$.   (23)

This is the conditional analogue of equation (9).

This shows that the variance is smaller than it is for unconditioned
walks, but that the difference is negligible when $s \gg S$. The mean
is more interesting:

$\langle\delta_{1\times}|s, \Delta, S\rangle = \Delta + \mu_\Delta = \frac{\delta_c(s) + \beta^2\Delta}{1+\beta^2}$   (24)

is shifted by $\Delta\beta^2/(1+\beta^2)$ compared to the
unconditional mean. Even more suggestively, this implies that
$\langle\Delta_{1\times-c}|s, \Delta, S\rangle = [\Delta - \delta_c(s)]\,\beta^2/(1+\beta^2)$.
The dependence of this mean on the larger scale $\Delta$ is this
model's expression of assembly bias, and is an important way in which
the two-walk problem differs from the one-walk problem. When
$\beta = 0$ the distribution becomes a delta-function centered on
$\delta_c$; since it is therefore independent of $\Delta$, this shows
explicitly that the one-dimensional solution shows no assembly bias
when the steps in the walk are uncorrelated.

Figure 2 illustrates the effect: objects which are surrounded by
large scale overdensities tend to have larger $\delta_{1\times}$ than
objects of the same mass in large-scale underdensities. Since they
have above average initial overdensities on scale $s$, they will also
tend to have above average overdensities at formation (typically, on
scale $\sim 2s$). The result is a correlation, at fixed halo mass,
between the density at formation and environment - even though there
will not be a correlation between formation time (rather than the
overdensity at the formation time) and environment. (In this model,
as for the one-dimensional case, any correlation between formation
time and larger scale environment can only come from correlations
between steps.) Since the density at formation is correlated with
halo concentration at virialization (Navarro et al. 1997), our model
predicts a correlation between halo concentration and environment at
fixed mass.

Figure 2. Dependence of walk height at first crossing,
$\delta_{1\times}$, on large scale environment. Symbols with error
bars show the distribution of $\delta_{1\times}$ for walks which
first cross each other on scale $s$, and which had height $\Delta$ on
scale $S < s$; smooth dashed curves show equation (22). Black
histogram shows the corresponding unconditional distribution for the
same value of $s$; smooth solid curve shows the corresponding
prediction (equation 9).


3 EXTENSIONS 
3.1 Correlated steps 

Our change of variables from $\delta$, $g$ to walks which step
parallel and perpendicular to the barrier makes it straightforward to
see what should happen when both $\delta$ and $g$ are walks with
correlated steps. If the correlations are the result of smoothing
with the same filter, then the unconditional distribution $f(s)$
should be replaced with the corresponding expression in Musso & Sheth
(2012) (see discussion following equation 32), but the distribution
of the walk height at first crossing, $p(\delta_{1\times}|s)$,
remains unchanged. This is because, at first crossing,
$\delta_{1\times}$ depends only on $g_+$ (by definition), and $g_+$,
although it has correlated steps, is independent of $g_-$, so it is
not constrained by the fact that $g_- = \delta_c/\sqrt{1+\beta^2}$.

Following Paranjape et al. (2012) the progenitor distribution should
be well-approximated by replacing
$\delta_{c1} - \delta_{c0} \to \delta_{c1} - (S_\times/S)\,\delta_{c0}$
and $(s - S)(1+\beta^2) \to [s - (S_\times/S)^2 S](1+\beta^2)$, and
the conditional distribution by replacing
$\delta_{c0} - \Delta \to \delta_{c0} - (S_\times/S)\,\Delta$ and
$s(1+\beta^2) - S \to s(1+\beta^2) - (S_\times/S)^2 S$. Still more
accurate expressions follow from making the corresponding
replacements in the expressions provided in Musso et al. (2012).
Testing these expressions is the subject of work in progress.



3.2 Correlated Walks 

Suppose instead that steps in $\delta$ are uncorrelated, whereas
steps in $g$ are correlated with those in $\delta$. This may happen,
for example, if the critical density for collapse depends on the
overdensity on a larger scale, e.g. in the correlated galaxy
formation model of Bower et al. (1993), or in theories of modified
gravity (Lam & Li 2012). Then, let

$\rho \equiv \frac{\langle\delta g\rangle}{\langle\delta^2\rangle^{1/2}\,\langle g^2\rangle^{1/2}}$   (25)

denote the correlation parameter between $\delta$ and $g$. If we make
the same coordinate transformation as before, then
$\langle g_-^2\rangle = s\,(1+\beta^2-2\rho\beta)/(1+\beta^2)$,
$\langle g_+^2\rangle = s\,(1+\beta^2+2\rho\beta)/(1+\beta^2)$ and
$\langle g_+ g_-\rangle = s\,\rho\,(1-\beta^2)/(1+\beta^2)$. We can
always write $p(g_-, g_+) = p(g_-)\,p(g_+|g_-)$, where
$p(g_+|g_-) \ne p(g_+)$ is a Gaussian distribution with mean
$g_-\,\langle g_+ g_-\rangle/\langle g_-^2\rangle$ and variance
$\langle g_+^2\rangle\,[1 - \langle g_+ g_-\rangle^2/\langle g_-^2\rangle\langle g_+^2\rangle]$.

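Since the first crossing distribution depends on $p(g_-)$ and not on
$g_+$, it is given by the same expression as for uncorrelated walks,
but with $\nu^2 = (\delta_c^2/s)/(1+\beta^2-2\rho\beta)$, because the
barrier height $\delta_c/\sqrt{1+\beta^2}$ must now be compared with
$\langle g_-^2\rangle = s\,(1+\beta^2-2\rho\beta)/(1+\beta^2)$. This
shows that the amount by which $\delta_c$ appears to be rescaled
depends on $\beta$ as well as the correlation parameter.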

However, $p(\delta_{1\times}|s)$ will be affected. Namely, at first
crossing, $\delta_{1\times}$ is given by equation (8), so

$\langle\delta_{1\times}|s\rangle = \frac{\delta_c}{1+\beta^2} + \frac{\beta}{\sqrt{1+\beta^2}}\,\langle g_+|g_-\rangle$   (26)

where, because $g_- = \delta_c/\sqrt{1+\beta^2}$ at first crossing,

$\langle g_+|g_-\rangle = \frac{\delta_c}{\sqrt{1+\beta^2}}\,\frac{\rho\,(1-\beta^2)}{1+\beta^2-2\rho\beta}$.   (27)

Therefore, $p(\delta_{1\times}|s)$ is Gaussian with mean and variance

$\mu_{1\times} = \frac{\delta_c}{1+\beta^2}\left[1 + \frac{\rho\beta\,(1-\beta^2)}{1+\beta^2-2\beta\rho}\right]$   (28)

$\Sigma_{1\times}^2 = \frac{s\,\beta^2\,(1-\rho^2)}{1+\beta^2-2\beta\rho}$.   (29)

For $\rho = 0$, this reduces to equation (9); for $\rho = 1$ or $-1$,
corresponding to complete correlation or anti-correlation, the
distribution becomes a Dirac delta function centered on
$\delta_c/(1-\beta)$ or $\delta_c/(1+\beta)$, respectively. (This can
be understood simply from the fact that, in these limiting cases, the
two-dimensional walk is confined to a line, and this line can only
cross the line defined by the barrier at a single point.)

It is a curious fact that when $\beta = 1$ (the two walks have the
same variance), then there is no shift to the mean, and the variance
becomes $s\,(1+\rho)/2$. This can be traced back to the fact that,
when $\beta = 1$, then $\langle g_+ g_-\rangle = 0$; i.e., the walks
in $g_-$ and $g_+$ are independent (even though $\delta$ and $g$ are
correlated), but they have different variances.

But in general, correlations between the walks lead to a shift in the
mean and a rescaling of the variance. However, they do not change the
fact that $p(\delta_{1\times}|\sigma)$ is Gaussian. In practice, one
should be able to determine if $\rho \ne 0$ because the three
unknowns, $\delta_c$, $\beta$ and $\rho$ can be determined from our
expressions for the mean and variance of
$p(\delta_{1\times}|\sigma)$ and the required rescaling of $s$ in the
first crossing distribution $f(s)$.
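The mean and variance for correlated walks can be cross-checked by building them directly from the second moments of the rotated walks. The sketch below assumes $\langle\delta g\rangle = \rho\beta s$ and the conditional-Gaussian rules quoted in the text; the compact closed forms $\delta_c(1-\rho\beta)/(1+\beta^2-2\rho\beta)$ and $s\beta^2(1-\rho^2)/(1+\beta^2-2\rho\beta)$ used for comparison are our own algebraic simplification, and all function names are illustrative.

```python
import math

def correlated_moments(s, beta, rho):
    """(<g_-^2>, <g_+^2>, <g_+ g_->) when <delta g> = rho * beta * s."""
    n2 = 1.0 + beta**2
    var_minus = s * (n2 - 2.0 * rho * beta) / n2
    var_plus = s * (n2 + 2.0 * rho * beta) / n2
    cross = s * rho * (1.0 - beta**2) / n2
    return var_minus, var_plus, cross

def height_at_crossing(delta_c, s, beta, rho):
    """Mean and variance of delta at first crossing, from the moments."""
    var_minus, var_plus, cross = correlated_moments(s, beta, rho)
    norm = math.sqrt(1.0 + beta**2)
    g_minus = delta_c / norm                    # g_- sits on the barrier
    mean_plus = g_minus * cross / var_minus     # <g_+ | g_->
    var_plus_cond = var_plus - cross**2 / var_minus
    mean = delta_c / norm**2 + beta * mean_plus / norm
    var = beta**2 * var_plus_cond / norm**2
    return mean, var

def closed_form(delta_c, s, beta, rho):
    """Simplified mean and variance (our simplification of the text)."""
    denom = 1.0 + beta**2 - 2.0 * rho * beta
    mean = delta_c * (1.0 - rho * beta) / denom
    var = s * beta**2 * (1.0 - rho**2) / denom
    return mean, var
```

Setting `rho = 0` recovers the uncorrelated results, and `beta = 1` gives a mean of $\delta_c/2$ for any $\rho$, in line with the remark that there is then no shift to the mean.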

3.3 Higher-dimensional walks and/or other 
distributions 

Our fundamental assumption, that equation (1) accurately captures the
physics of collapse, is, of course, only an idealization. Note,
however, that if other variables also mattered, and they were also
Gaussian distributed, such that equation (1) becomes

$\delta \ge \delta_c(s) + \sum_i g_i$   (30)

then, because the sum of Gaussians is itself Gaussian, this
n-dimensional model reduces to the 2-dimensional one we have just
solved, with $\beta^2 = \sum_i \beta_i^2$ (assuming the $g_i$ are
independent of one another).

Alternatively, suppose instead that

$\delta \ge \delta_c + \chi$,   (31)

where $\chi$ follows a non-Gaussian distribution. E.g., Sheth et al.
(2012) study a model in which $\delta_c$ is independent of $s$, but
$\chi^2$ is drawn from a chi-squared distribution with five degrees
of freedom. However, this distribution has a mean which depends on
$s$. If the distribution of
$\Delta\chi \equiv \chi - \langle\chi\rangle$ is not too different
from a Gaussian, then we can use our 2-dimensional Gaussian model as
a reasonable approximation to this one, with $\delta_c(s)$ in
equation (1) equal to $\delta_c + \langle\chi\rangle$ and $g$ a
zero-mean Gaussian variate having the same variance as $\Delta\chi$.
E.g., for the model in Sheth et al. (2012),
$\langle\chi\rangle \approx 0.95\sqrt{s}$ and
$\langle(\Delta\chi)^2\rangle \approx 0.09\,s$. I.e., this model
should be reasonably well approximated by our two-Gaussian model with
$\delta_c(s) = \delta_c + 0.95\sqrt{s}$ and $\beta^2 = 0.09$.

This has the following interesting consequence. At first crossing,
the distribution of $\Delta\chi$ will be like that of $g_{1\times}$,
meaning that it should have mean and variance approximately given by
$-\delta_c(s)\beta^2/(1+\beta^2)$ and $s\beta^2/(1+\beta^2)$. Since
the variance of the initial variate $\chi$ was $\beta^2 s$, one
should think of $\chi_{1\times}$ as having variance reduced by
$(1+\beta^2)^{-1}$. For it to still have approximately the same
functional form as $\chi$ itself, it should have mean
$0.95\sqrt{s/(1+\beta^2)}$, which is smaller than the original value
of $0.95\sqrt{s}$. For $\beta \ll 1$, we can think of this as a shift
in the mean by $-0.95\sqrt{s}\,\beta^2/2$. The actual shift,
$-\delta_c(s)\beta^2/(1+\beta^2)$, has the same sign, but a different
amplitude, indicating that the distribution of $\chi_{1\times}$ will
not be quite the same as that of $\chi$ itself.

We end this discussion with a word of caution: Although mapping to an
effective Gaussian is useful, it may hide interesting physics. For
example, the non-Gaussian stochasticity in Sheth et al. (2012)
results in a quadrupolar signature for Lagrangian space halo bias;
using an effective Gaussian obscures the origin of this angular
dependence.



3.4 Excursion set peaks 

For walks associated with peaks in $\delta - g$, one must simply add
a weight which depends on $d(\delta - g)/ds$ (Musso & Sheth 2012).
The associated first crossing distribution becomes that for excursion
set peaks (Paranjape & Sheth 2012), provided we remember to rescale
$\delta_c(s) \to \delta_c(s)/\sqrt{1+\beta^2}$, because the peaks are
in $g_-$ rather than in $\delta$. Namely,

$s f(s) = \frac{\exp(-\nu_\beta^2/2)}{2\gamma\sqrt{2\pi}}\,\frac{\int dx\,x\,F(x)\,p(x|\gamma, \nu_\beta)}{(R_*/R)^3}$   (32)



© 0000 RAS, MNRAS 000, [TJl] 







Figure 3. First crossing distribution for all walks when steps are
uncorrelated (solid); when steps are correlated because of Gaussian
smoothing and the power-spectrum is $P(k) \propto k^{-1.2}$; when the
walks are centered on peaks in $\delta - g$ (equation 32) and on
peaks in $\delta$ only (equation 34).

Figure 4. Distribution of walk height at first crossing,
$\delta_{1\times}$, for all walks (dotted) and for walks which are
also peaks in $\delta$ on scale $s$ (solid); i.e., equations (9) and
(36) respectively. The distribution for peaks in $\delta - g$ is also
given by the dotted curve.



where $\nu_\beta = (\delta_c/\sigma)/\sqrt{1+\beta^2}$ as before
(c.f. equation 7), the parameters $\gamma$ and $R_*$ are defined by
equation (4.6a) in Bardeen et al. (1986),

$p(x|\gamma, \nu_\beta) = \frac{e^{-(x - \gamma\nu_\beta)^2/2(1-\gamma^2)}}{\sqrt{2\pi(1-\gamma^2)}}$   (33)

is the usual conditional Gaussian (i.e. $\gamma$ is the correlation
coefficient between $x$ and $\nu_\beta$), and $F(x)$ is given by
equation (A15) of Bardeen et al. (1986). (The Musso-Sheth
approximation for the first crossing distribution for all walks with
correlated steps has $F(x) = 1$ and $R = R_*$.)

The distribution of $\delta_{1\times}$ is then unchanged from that
for all walks (equation 9), because a constraint on the 'velocity' of
$g_-$, which is what the peaks constraint boils down to (Musso &
Sheth 2012), means nothing for $g_+$, which is what determines
$\delta_{1\times}$. The statistics of walks centered on a randomly
chosen particle within a protohalo are known to be different from
those centered on the protohalo center of mass; the latter yield
larger values of $\delta_{1\times}$ (Sheth et al. 2001; Achitouv et
al. 2013; Despali et al. 2013). Therefore, the analysis above
indicates that a model which identifies protohalo centers of mass
with peaks in $\delta - g$ cannot explain this difference.

If we identify protohalo centers of mass on scale $s$ with positions
where $\delta$ first exceeds $\delta_c + g$ and are peaks in $\delta$
(rather than in $\delta - g$) on that scale, then the first crossing
distribution becomes

$s f(s) = \frac{e^{-\nu_\beta^2/2}}{2\gamma\,(R_*/R)^3\sqrt{2\pi}} \int dx\,x\,p(x|\gamma, \nu_\beta)\,G_0(x, \gamma_\beta x)$   (34)

where we have defined $\gamma_\beta \equiv (1+\beta^2)^{-1/2}$ and

$G_n(x, \gamma_\beta x) \equiv \int dy\,p(y|\gamma_\beta, x)\,F(y)\,y^n$   (35)

with $F(y)$ the same quantity that appears in equation (32), i.e.,
given by equation (A15) of Bardeen et al. (1986). Similarly, a little
algebra shows that in this case the distribution of
$\delta_{1\times}$ is given by

$p_{\rm pk}(\delta_{1\times}|s) = A_{1\times}\,p(\delta_{1\times}|s)\,[G_1 - \gamma\,(\nu_{1\times} - \nu_c)\,G_0]$,   (36)

where $p(\delta_{1\times}|s)$ is the distribution for all walks
(equation 9), $G_n(\nu_{1\times}, \gamma\nu_{1\times})$ is given by
equation (35), and
$A_{1\times} = [\sqrt{1+\beta^2}\int dx\,x\,p(x|\gamma, \nu_\beta)\,G_0(x, \gamma_\beta x)]^{-1}$
is a normalization factor which ensures that the integral over all
$\delta_{1\times}$ yields unity.

In the limit /3 — > 0, the distribution p(y\y /3 ,x) be- 
comes sharply peaked around its mean value j^x — > x, 
so that Go(x,~{pX) — > F{x). Thus, in this limit, equa- 
tion (I34|l reduces to equation (|32[) . Similarly, p(Si x \s) be- 
comes a delta function centered on S c /(1 + /3 2 ), making 
P P k(<5ix|s) -> Aix p(5ix\s) Gi. Since A tx -> G^ 1 in this 
limit, Ppk(<5ix|s) — > p(Six\s) as it should. 

In general, at large $\nu_{1\times}$, $G_1/G_0 \to \gamma\nu_{1\times}$, making
$p_{\rm pk}(\delta_{1\times}|s) \propto p(\delta_{1\times}|s)\, G_0(\nu_{1\times}, \gamma\nu_{1\times})$;
this illustrates that the term in square brackets acts to skew the
distribution towards larger $\delta_{1\times}$. Figures 3 and 4 show this
explicitly: they compare $sf(s)$ and $p(\delta_{1\times}|s)$ for these two peak
models with that for all walks. In practice, we use equations (4.4)
and (6.13) of Bardeen et al. (1986) to approximate $G_0$ and $G_1/G_0$,
and we assumed Gaussian smoothing of a scale-free power spectrum,
i.e. $P(k) \propto k^n$, for which $\gamma^2 = (n+3)/(n+5)$ and
$(R_*/R)^2 = 6/(n+5)$. To make the Figures, we set $n = -1.2$ and
$\beta = 1$ to highlight the effects of $\beta$.
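The scale-free expressions for $\gamma$ and $R_*/R$ follow from the spectral moments. As a minimal sketch, assuming the standard Bardeen et al. (1986) definitions $\gamma = \sigma_1^2/(\sigma_0\sigma_2)$ and $R_* = \sqrt{3}\,\sigma_1/\sigma_2$, and a Gaussian filter so that $\sigma_j^2 \propto \int dk\, k^{n+2+2j}\, e^{-k^2R^2}$, the moments reduce to Gamma functions. The function names below are illustrative.

```python
import math

def sigma_j_sq(j, n, R=1.0):
    """Spectral moment sigma_j^2, up to a j-independent constant.

    Uses int_0^inf dk k^(p-1) exp(-k^2 R^2) = Gamma(p/2) / (2 R^p)
    with p = n + 3 + 2j (Gaussian smoothing of P(k) ~ k^n).
    """
    p = n + 3.0 + 2.0 * j
    return 0.5 * math.gamma(0.5 * p) / R**p

def gamma_and_rstar(n, R=1.0):
    """Return (gamma, R_*/R) for a scale-free spectrum of slope n > -3."""
    s0, s1, s2 = (sigma_j_sq(j, n, R) for j in range(3))
    gamma = s1 / math.sqrt(s0 * s2)
    r_star = math.sqrt(3.0 * s1 / s2)
    return gamma, r_star / R

if __name__ == "__main__":
    g, r = gamma_and_rstar(-1.2)
    # Compare with gamma^2 = (n+3)/(n+5) and (R_*/R)^2 = 6/(n+5)
    print(g**2, 1.8 / 3.8)
    print(r**2, 6.0 / 3.8)
```

For $n = -1.2$ this gives $\gamma^2 = 1.8/3.8 \approx 0.47$ and $(R_*/R)^2 = 6/3.8 \approx 1.58$, independent of the smoothing scale $R$.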

Figure 3 shows that peaks in $\delta$ and $\delta - g$ do indeed
produce different counts (short and long dashed curves,
respectively); both are different from the result for all
walks (dotted). And Figure 4 shows that the distribution
given in equation (36) is indeed shifted to larger values of
$\delta_{1\times}$, with the shift depending weakly on the mass scale $\nu_c$.
This increase in $\delta_{1\times}$ is qualitatively in the right direction,
suggesting that identifying protohalos with peaks in $\delta$ is
a better model than one where protohalos are identified
with peaks in $\delta - g$. However, the predicted distribution
for peaks is not as different from that for all walks as is
the difference seen in simulations between centre-of-mass
walks and randomly chosen ones (the shift in the mean
is not large enough, the width is not narrow enough, and
the shape is not skewed enough).

Before moving on, we note that, in the one-dimensional
problem, the peaks-motivated approach is attractive because
it provides a natural reason why halo counts in simulations
do not fall as steeply as $\exp(-\delta_c^2/2s)$ at small $s$.
The two-Gaussian model here achieves this by setting
$\beta \approx 0.6$ (see discussion at end of Section 2.2).
The analysis above indicates that peaks in this two-Gaussian
model will require a smaller value of $\beta$ to reproduce the
halo counts. Reproducing the distributions $p(\delta_{1\times}|s)$
and $p_{\rm pk}(\delta_{1\times}|s)$ then provides important
self-consistency tests. Since reducing $\beta$ from the value
used to make Figure 4 will only make all the curves
there more similar to one another, this will exacerbate
the discrepancies between model and simulations. Thus,
our analysis suggests that neither of the peaks models we
have considered here is consistent with measurements.
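To illustrate how $\beta$ softens the small-$s$ cutoff, the sketch below compares the one-walk counts with the two-Gaussian ones. It assumes uncorrelated steps and a constant barrier, for which the first crossing distribution of all walks takes the one-walk form with $\delta_c \to \delta_c/\sqrt{1+\beta^2}$, i.e. $sf(s) = \nu_\beta\, e^{-\nu_\beta^2/2}/\sqrt{2\pi}$ with $\nu_\beta = \nu_c/\sqrt{1+\beta^2}$ and $\nu_c = \delta_c/\sqrt{s}$. The function name is illustrative.

```python
import math

def s_f_of_s(nu_c, beta=0.0):
    """s f(s) for all walks, assuming uncorrelated steps and constant delta_c.

    beta = 0 recovers the usual one-dimensional result; beta > 0 rescales
    the barrier height by 1/sqrt(1 + beta^2) (an assumption of this sketch).
    """
    nu_beta = nu_c / math.sqrt(1.0 + beta**2)
    return nu_beta * math.exp(-0.5 * nu_beta**2) / math.sqrt(2.0 * math.pi)

if __name__ == "__main__":
    # Large nu_c corresponds to small s, i.e. massive halos.
    for nu_c in (1.0, 2.0, 3.0, 4.0):
        ratio = s_f_of_s(nu_c, beta=0.6) / s_f_of_s(nu_c, beta=0.0)
        print(nu_c, ratio)
```

The ratio grows rapidly with $\nu_c$, showing that the high-mass tail falls less steeply when $\beta > 0$.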



4 DISCUSSION 

We described a two-dimensional excursion set model, for
Gaussian walks in $\delta$ and $g$, for which almost all quantities
associated with first crossing distributions can be
computed analytically. We have tested all the analytic 
expressions we provide in this paper using Monte-Carlo 
realizations of the two-dimensional stochastic process, 
finding excellent agreement. Since the analytic arguments 
are sufficiently simple, we have only included a few plots 
showing this agreement. 
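A Monte-Carlo test of this kind can be sketched as follows. This is an illustrative reimplementation, not the code used for the paper. It assumes uncorrelated steps, a constant barrier with the illustrative value $\delta_c = 1.686$, a walk $\delta$ of variance $s$, and a walk $g$ of variance $\beta^2 s$. It compares the fraction of walks that have crossed by scale $s_{\rm max}$ with the analytic expectation ${\rm erfc}[\delta_c/\sqrt{2(1+\beta^2)s_{\rm max}}]$.

```python
import math
import random

def crossed_fraction(delta_c=1.686, beta=0.6, s_max=4.0,
                     n_steps=500, n_walks=4000, seed=1):
    """Fraction of walk pairs with delta >= delta_c + g at some s <= s_max."""
    rng = random.Random(seed)
    ds = s_max / n_steps
    sig_delta = math.sqrt(ds)        # step rms of the delta walk
    sig_g = beta * math.sqrt(ds)     # step rms of the g walk (assumed)
    crossed = 0
    for _ in range(n_walks):
        delta = 0.0
        g = 0.0
        for _ in range(n_steps):
            delta += rng.gauss(0.0, sig_delta)
            g += rng.gauss(0.0, sig_g)
            if delta >= delta_c + g:  # first crossing: stop this walk
                crossed += 1
                break
    return crossed / n_walks

def analytic_fraction(delta_c=1.686, beta=0.6, s_max=4.0):
    """Cumulative first crossing probability for continuous walks."""
    return math.erfc(delta_c / math.sqrt(2.0 * (1.0 + beta**2) * s_max))

if __name__ == "__main__":
    print(crossed_fraction(), analytic_fraction())
```

Because the walks are sampled at discrete steps, some crossings between steps are missed, so the Monte-Carlo fraction is expected to sit slightly below the continuum value; the difference shrinks as `n_steps` increases.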

Our predictions include the unconditional first crossing
distribution $f(s|z)$ (Section 2.2); the conditional first
crossing distribution for redshift $z$, $f(s,z|S,Z)$, by walks
which are known to have first crossed one another on
scale $S < s$ at redshift $Z < z$ (Section 2.5); and the
conditional distribution $f(s,z|S,\Delta)$ for walks which are
constrained to have height $\Delta$ on scale $S < s$ (Section 2.6).
These are usually used to model halo abundances, progenitor
distributions, and the environmental dependence
of clustering. In the one-dimensional case, for appropriately
chosen pairs of redshift and environment, the progenitor
and conditional distributions are the same. For
higher-dimensional walks, this is no longer the case: the
conditional distributions generically predict more massive
objects (Figure 1 and related discussion).

Another new feature of such higher-dimensional 
models is the fact that there is, generically, a distribution 
of walk heights at first crossing $p(\delta_{1\times}|s)$ (Section 2.3), and
an associated distribution of the other variable $p(g_{1\times}|s)$
(Section 2.4). For the Gaussian walks considered here,
these distributions are Gaussian, even when the barrier 
height depends on the first crossing scale s. We argued 
that s-dependence of the mean barrier height, with a 
Gaussian scatter around the mean, should provide a good 
approximation even when the walks are not Gaussian 
(Section CO}. 

We also argued that, because of the variable(s) which
are not $\delta$, halo bias in these models will generally be
stochastic (sometimes referred to as nonlocal), and the
conditional distributions will generically exhibit assembly
bias, even when the steps in the walks are uncorrelated.
We provided explicit expressions for both the stochastic
(Section 2.7) and the assembly bias (Section 2.8 and
Figure 2). Although our model predicts no correlation
between halo formation times and environment (at fixed
halo mass), in agreement with the one-dimensional case,
it nevertheless predicts that halos surrounded by overdensities
should be denser and more concentrated than
halos of the same mass in underdensities.

The lack of correlation between time and environ- 
ment is a consequence of studying walks with uncorre- 
lated steps. We sketched how to generalize our results 
to include correlations between the steps in each walk 
(necessary for quantitative comparison with simulations;
Section 3.1), and between the walks themselves (as might
arise in models where the critical density required for collapse
is determined by the overdensity on large scales;
Section 3.2). These will introduce additional assembly
bias effects, for the same reasons they do so for one- 
dimensional walks. Although we sketched how to quantify 
these here, we did not show plots or otherwise quantify 
these effects for the following reason. 

One of the drawbacks of this model - that is in common
with the usual one-dimensional walk approach - is
that it is explicitly about the statistics of all points in
space. However, halos form around special positions in
space, and the statistics of this point process - arguably
the point process for which the description of the physics
is simplest - are very different from those around randomly
chosen positions (Sheth et al. 2001; Paranjape & Sheth
2012; Achitouv et al. 2013). We argued that the
simplest case, in which halos form around positions which
are peaks in $\delta - g$, cannot explain this difference
(Section 3.4). Although a model in which halos form around
peaks in $\delta$ fares better (Figures 3 and 4), it fails to
adequately model the differences between walks centred
on all particles, and those centred on the special subset
which are protohalo centers of mass. Work in progress
shows how to extend this approach to include a more
elaborate model for protohalo centers-of-mass, but we
believe our results demonstrate the power of requiring
models to reproduce not just halo counts but the distribution
of $\delta$ at fixed halo mass as well.

APPENDIX A: PROOF OF EQUATION (15)

The main complication with respect to the one-dimensional
case is that the constraint that the walk passed
through $\Delta$ on scale $S$ still allows walks with a range of
values of $G$. This range is constrained by the requirement
that $\Delta$ and $G$ had not crossed on scales smaller than $S$.
At fixed $\Delta$ and $G$, the solution is straightforward, as we
show shortly, so the main work is to integrate this solution
over the allowed range of $G$.

As before, it is best to work in the $(g_+, g_-)$ plane, in
which case the requirement that the walk has height $\Delta$
on scale $S$ means that

$$ G_- < \frac{\delta_c}{\sqrt{1+\beta^2}} \quad {\rm and} \quad G_+ = \Delta\sqrt{1+\beta^2} - G_- \tag{A1} $$

(again capital letters indicate values at $S$). The distribution
of $s$ at which $\delta_c(s)/\sqrt{1+\beta^2}$ is first crossed, given
that the walk started from (G+, G-) on scale S, is given 
by equation © with 



(A2) 



s-S 

Notice that this expression depends only on $G_-$, so we
will denote the associated first crossing distribution as
$f(s|G_-,S)$.

To get the quantity we are after, $f(s|\Delta,S)$, we must
now integrate $f(s|G_-,S)$ over all allowed starting values
$(G_+, G_-)$, weighting by the probability of starting at
each. I.e.,

$$ f(s|\Delta,S) = A \int dG_+ \int dG_-\, f(s|G_-,S)\, p(G_+|S)\, q(G_-|S)\, \delta_{\rm D}(\delta - \Delta) \tag{A3} $$

where

$$ q(G_-|S) = \frac{e^{-G_-^2/2S}}{\sqrt{2\pi S}} - \frac{e^{-(2\delta_c/\sqrt{1+\beta^2} - G_-)^2/2S}}{\sqrt{2\pi S}} \tag{A4} $$

is the probability that (the one-dimensional) walk $g_-$ has
height $G_-$ at $S$ and never crossed $\delta_c/\sqrt{1+\beta^2}$ on some
smaller $s < S$ (Bond et al. 1991). $p(G_+|S)$ is a Gaussian
with zero mean and variance $S$, and

$$ A^{-1} = \int dG_+ \int dG_-\, p(G_+|S)\, q(G_-|S)\, \delta_{\rm D}(\delta - \Delta) \tag{A5} $$

defines a normalization constant which ensures that the
probabilities integrate to unity. This, and the integral in
eq. (A3), can be performed analytically, yielding

$$ f(s|\Delta,S) = \frac{(\delta_c - \Delta)(1+\beta^2)}{s - S + \beta^2 s}\, \frac{\exp\left[-(\delta_c - \Delta)^2/2(s - S + \beta^2 s)\right]}{\sqrt{2\pi(s - S + \beta^2 s)}}. \tag{A6} $$

This is equivalent to the change of variables given by
equation (15) of the main text.
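As a quick consistency check, the sketch below evaluates equation (A6) as we read it, taking $\delta_c$ to be constant (with the illustrative value 1.686), and compares it with the familiar one-dimensional conditional first crossing distribution. The function names are illustrative. The check confirms that $\beta = 0$ recovers the one-walk result, and that for $\beta > 0$ the expression is the one-walk form evaluated at the rescaled variance $s - S + \beta^2 s$, multiplied by the Jacobian $1+\beta^2$.

```python
import math

def f_cond(s, S, Delta, delta_c=1.686, beta=0.0):
    """Equation (A6) as read here, for constant delta_c and s > S."""
    V = s - S + beta**2 * s            # effective variance
    d = delta_c - Delta                # remaining distance to the barrier
    return ((1.0 + beta**2) * d / V
            * math.exp(-0.5 * d * d / V) / math.sqrt(2.0 * math.pi * V))

def f_one_walk(V, d):
    """One-dimensional first crossing distribution in the variable V."""
    return d / V**1.5 * math.exp(-0.5 * d * d / V) / math.sqrt(2.0 * math.pi)

if __name__ == "__main__":
    s, S, Delta = 2.0, 0.5, 0.3
    d = 1.686 - Delta
    # beta = 0: reduces to the one-walk conditional distribution
    print(f_cond(s, S, Delta), f_one_walk(s - S, d))
    # beta > 0: change of variables s -> s - S + beta^2 s
    beta = 0.6
    V = s - S + beta**2 * s
    print(f_cond(s, S, Delta, beta=beta), (1.0 + beta**2) * f_one_walk(V, d))
```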